Cell-probe bounds for online edit distance and other pattern matching problems

Authors

  • Raphaël Clifford
  • Markus Jalsenius
  • Benjamin Sach
Abstract

We give cell-probe bounds for the computation of edit distance, Hamming distance, convolution and longest common subsequence in a stream. In this model, a fixed string of n symbols is given and one δ-bit symbol arrives at a time in a stream. After each symbol arrives, the distance between the fixed string and a suffix of the most recent symbols of the stream is reported. The cell-probe model is perhaps the strongest model of computation for showing data structure lower bounds, subsuming in particular the popular word-RAM model.

  • We first give an Ω((δ log n)/(w + log log n)) lower bound for the time to give each output for both online Hamming distance and convolution, where w is the word size. This bound relies on a new encoding scheme and for the first time holds even when w is as small as a single bit.
  • We then consider the online edit distance and longest common subsequence problems in the bit-probe model (w = 1) with a constant-sized input alphabet. We give a lower bound of Ω(√(log n)/(log log n)) which applies to both problems. This second set of results relies both on our new encoding scheme and on a carefully constructed hard distribution.
  • Finally, for the online edit distance problem we show that there is an O((log n)/w) upper bound in the cell-probe model. This bound contrasts with our new lower bound and also establishes an exponential gap between the known cell-probe and RAM model complexities.
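To make the streaming setting in the abstract concrete, here is a minimal sketch of the online Hamming distance problem: a fixed string is known in advance, symbols arrive one at a time, and after each arrival the distance to the most recent n symbols is reported. This is a naive baseline that spends O(n) time per output, not the encoding scheme or any algorithm from the paper. The function name `online_hamming` and the choice to report `None` until n symbols have arrived are assumptions made for illustration.

```python
from collections import deque


def online_hamming(fixed, stream):
    """Yield the Hamming distance between `fixed` and the last
    len(fixed) symbols of `stream` after each arriving symbol.

    Illustrative baseline only (hypothetical helper, O(n) per output).
    Yields None while fewer than len(fixed) symbols have arrived.
    """
    n = len(fixed)
    window = deque(maxlen=n)  # keeps only the n most recent symbols
    for symbol in stream:
        window.append(symbol)
        if len(window) < n:
            yield None  # window not yet full: no n-length suffix exists
        else:
            # Count positions where the fixed string and the window differ.
            yield sum(a != b for a, b in zip(fixed, window))


if __name__ == "__main__":
    for out in online_hamming("abc", "abcabd"):
        print(out)
```

The lower bounds in the abstract concern how many memory cells any data structure must probe per output, so they apply regardless of how cleverly this naive loop is replaced.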

Similar resources

Cell-Probe Lower Bounds for Bit Stream Computation

We revisit the complexity of online computation in the cell probe model. We consider a class of problems where we are first given a fixed pattern F of n symbols and then one symbol arrives at a time in a stream. After each symbol has arrived we must output some function of F and the n-length suffix of the arriving stream. Cell probe bounds of Ω(δ lg n/w) have previously been shown for both convo...

Space Lower Bounds for Online Pattern Matching

We present space lower bounds for online pattern matching under a number of different distance measures. Given a pattern of length m and a text that arrives one character at a time, the online pattern matching problem is to report the distance between the pattern and a sliding window of the text as soon as the new character arrives. We require that the correct answer is given at each position w...

Adaptive Approximate Record Matching

Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to the same entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...

The complexity of computation in bit streams

We revisit the complexity of online computation in the cell probe model. We consider a class of problems where we are first given a fixed pattern or vector F of n symbols and then one symbol arrives at a time in a stream. After each symbol has arrived we must output some function of F and the n-length suffix of the arriving stream. Cell probe bounds of Ω(δ lg n/w) have previously been shown for...

Error Tree: A Tree Structure for Hamming & Edit Distances & Wildcards Matching

Error Tree is a novel tree structure oriented mainly toward solving approximate pattern matching problems under Hamming and edit distances, as well as the wildcard matching problem. The input is a text of length n over a fixed alphabet of size Σ, a pattern P of length m, and a threshold k. The output is all positions that match P with Hamming distance ≤ k, edit distance ≤ k, or wildcard matching. Th...

Publication date: 2015